26 research outputs found

Level Playing Field for Million Scale Face Recognition

Authors: Kemelmacher-Shlizerman Ira; Nech Aaron
Publication date: 30/04/2017

Face recognition is perceived as a solved problem; however, when tested at the million scale, it exhibits dramatic variation in accuracy across algorithms. Are the algorithms very different? Is access to good/big training data their secret weapon? Where should face recognition improve? To address these questions, we created a benchmark, MF2, that requires all algorithms to be trained on the same data and tested at the million scale. MF2 is a public large-scale set with 672K identities and 4.7M photos, created with the goal of leveling the playing field for large-scale face recognition. We contrast our results with findings from the other two large-scale benchmarks, MegaFace Challenge and MS-Celebs-1M, where groups were allowed to train on any private/public/big/small set. Some key discoveries: 1) algorithms trained on MF2 were able to achieve state-of-the-art results, comparable to algorithms trained on massive private sets; 2) some outperformed themselves once trained on MF2; 3) invariance to aging suffers from low accuracies, as in MegaFace, identifying the need for larger age variations, possibly within identities, or for adjustment of algorithms in future testing.

arXiv.org e-Print Archive

Crossref

Soccer on Your Tabletop

Authors: Curless Brian; Kemelmacher-Shlizerman Ira; Rematas Konstantinos; Seitz Steve
Publication date: 03/06/2018

We present a system that transforms a monocular video of a soccer game into a moving 3D reconstruction, in which the players and field can be rendered interactively with a 3D viewer or through an Augmented Reality device. At the heart of our paper is an approach to estimate the depth map of each player, using a CNN that is trained on 3D player data extracted from soccer video games. We compare with state-of-the-art body pose and depth estimation techniques, and show results on both synthetic ground truth benchmarks and real YouTube soccer footage.

Comment: CVPR'18. Project: http://grail.cs.washington.edu/projects/soccer

arXiv.org e-Print Archive

Crossref

DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion

Authors: Holynski Aleksander; Karras Johanna; Kemelmacher-Shlizerman Ira; Wang Ting-Chun
Publication date: 14/04/2023

We present DreamPose, a diffusion-based method for generating animated fashion videos from still images. Given an image and a sequence of human body poses, our method synthesizes a video containing both human and fabric motion. To achieve this, we transform a pretrained text-to-image model (Stable Diffusion) into a pose-and-image guided video synthesis model, using a novel finetuning strategy, a set of architectural changes to support the added conditioning signals, and techniques to encourage temporal consistency. We fine-tune on a collection of fashion videos from the UBC Fashion dataset. We evaluate our method on a variety of clothing styles and poses, and demonstrate that our method produces state-of-the-art results on fashion video animation. Video results are available on our project page.

Comment: Project page: https://grail.cs.washington.edu/projects/dreampose

arXiv.org e-Print Archive